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Abstract 

Experience  with  solving  a  12,753,313  variable  linear  program  is  described.  This  problem 
is  the  linear  programming  relaxation  of  a  set  partitioning  problem  arising  from  an  airline 
crew  scheduling  application.  A  scheme  is  described  that  requires  successive  solutions  of 
small  subproblems,  yielding  a  procedure  that  has  little  growth  in  solution  time  in  terms  of 
the  number  of  variables.  Experience  using  the  simplex  method  as  implemented  in  CPLEX, 
an  interior  point  method  as  implemented  in  OBI,  and  a  hybrid  interior  point/simplex  ap¬ 
proach  is  reported.  The  resulting  procedure  illustrates  the  power  of  an  interior  point/simplex 
combination  for  solving  very  large-scale  linear  programs. 
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1  Introduction 

Recent  improvements  in  implementations  of  the  simplex  method  as  well  as  developments  in 
interior  point  methods  have  changed  our  concept  of  large-scale  linear  programming.  In  this 
study,  experience  in  solving  the  linear  programming  relaxation  of  a  large  set  partitioning 
problem  on  a  CRAY  Y-MP1  supercomputer  is  reported.  The  linear  program  has  837  rows 
with  12,753,313  columns  and  99,845,826  nonzeros  and  arises  from  airline  crew  scheduling. 
Using  a  number  of  techniques,  the  solution  time  is  reduced  to  approximately  4  minutes  with 
a  3.5  minute  preprocessing  time. 

The  basic  procedure  (called  sifting )  is  a  kind  of  “column  generation”  method.  Its  appli¬ 
cation  for  airline  crew  scheduling  was  proposed  by  John  Forrest  [4],  but  in  principle,  it  can  be 
applied  to  many  problem  classes.  Computational  experience  is  reported  for  this  procedure 
using  two  different  methods  to  solve  the  linear  programs  that  are  generated:  CPLEX,2  an 
implementation  of  the  simplex  method  by  Bixby  [3],  and  OBI,  an  implementation  of  the 
primal-dual  predictor-corrector  interior  point  method  by  Lustig,  Marsten  and  Shanno  [9].  In 
addition,  a  hybrid  interior  point-simplex  implementation  is  explored. 

In  Section  2,  the  characteristics  of  the  set  partitioning  problems  are  discussed  and  in 
Section  3,  the  sifting  procedure  is  described.  Section  3.2  describes  a  new  pricing  rule.  Sec¬ 
tions  4  and  5  describe  our  experience  with  the  sifting  procedure  using  the  simplex  method 
and  an  interior  point  method,  respectively.  This  experience  led  to  the  development  of  a  hy¬ 
brid  scheme,  using  both  the  interior  point  method  and  the  simplex  method.  It  is  described 
in  Section  6.  This  hybrid  scheme  illustrates  how  interior  point  methods  and  the  simplex 
method  can  be  effectively  combined,  exploiting  the  relative  advantages  of  each  method. 

2  A  Set  Partitioning  Problem 

Set  partitioning  models  have  the  form  [14]: 

...  x 

minimize  c  x 

subject  to  Ax  =  e,  (1) 

x  >  0, 
x  integer. 

where  A  £  {0,  l}mxn,  e  =  (1,1,...,  1)T  £  3ftm,  and  c  £  9ft".  In  the  problems  studied,  c  >  0. 
The  impetus  for  this  work  was  the  “American  Airlines  Challenge.”  American  Airlines  uses  a 
set  partitioning  model  of  the  flight  crew  scheduling  problem  [6].  (For  a  recent  discussion  of 
crew  scheduling,  see  Barutt  and  Hull  [2].)  In  order  to  stimulate  fresh  thinking  on  the  solution 
of  such  models,  American  generated  an  instance  with  837  rows  and  approximately  12,750,000 
columns.  They  then  made  the  data  available  to  researchers.  The  ultimate  challenge  is  to 
obtain  an  optimal  integer  solution.  As  a  step  toward  that  goal,  the  sifting  procedure  was 

1CRAY  and  CRAY  Y-MP  are  trademarks  of  Cray  Research,  Inc. 

2CPLEX  is  a  trademark  of  CPLEX  Optimization  Inc. 
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Number  of 
Columns  h 

Nonduplicate 

Columns 

Method 

Iterations 

CRAY  CPU 

Time  (secs) 

25,000 

17,937 

CPLEX  Dual  SE 

302 

1.4 

25,000 

17,937 

CPLEX  Primal  SE 

1768 

22.6 

25,000 

17,937 

OBI 

16 

17.3 

50,000 

35,331 

CPLEX  Dual  SE 

509 

4.8 

50,000 

35,331 

CPLEX  Primal  SE 

8233 

179.7 

50,000 

35,331 

OBI 

18 

22.9 

100,000 

68,428 

CPLEX  Dual  SE 

1215 

27.3 

100,000 

68,428 

OBI 

21 

34.1 

200,000 

134,556 

CPLEX  Dual  SE 

2549 

116.6 

200,000 

134,556 

OBI 

30 

67.7 

Table  1:  Direct  solution  of  problems  with  CPLEX  and  OBI  (SE  is  Steepest 
Edge).  Problems  are  generated  by  taking  the  first  h  columns  from  the  input 
file,  where  h  =  (25000, 50000, 100000, 200000). 

developed  and  used  to  solve  the  associated  linear  programming  relaxation  (obtained  by 
dropping  integrality  in  (1)). 

Airline  crew  scheduling  problems  exhibit  a  number  of  interesting  characteristics.  Due 
to  the  way  the  columns  are  generated  [6],  the  matrix  A  contains  many  duplicate  columns. 
Duplicate  columns  are  columns  that  have  nonzeros  in  exactly  the  same  rows,  but  possibly 
different  cost  coefficients.  Eliminating  all  but  one  of  these  columns  can  significantly  reduce 
the  size  of  the  problem  without  affecting  the  optimal  solution. 

In  typical  airline  crew  scheduling  problems  there  are  an  average  of  about  10  ones  per 
column,  with  a  maximum  of  18  in  the  instance  provided  by  American.  The  cost  coefficients 
c  are  highly  variable  (in  the  range  from  zero  to  10,000  in  the  instance  provided).  On  the 
other  hand,  the  right-hand-side  is  constant.  Thus,  (1)  is  typically  highly  degenerate,  but  its 
dual  is  not.  This  fact  plays  a  significant  role  in  the  solution  strategies  used. 

Primarily  because  of  degeneracy,  but  also  because  of  slow  convergence  even  when  objec¬ 
tive  function  progress  is  being  made,  relaxations  of  set  partitioning  problems  have  long  been 
considered  difficult.  Even  small  numbers  of  columns  can  result  in  a  problem  that  is  trouble¬ 
some  for  standard  implementations  of  the  primal  simplex  method.  Recent  developments  of 
dual  simplex  methods  and  interior  point  methods  have  largely  eliminated  that  difficulty,  at 
least  for  moderate  sized  instances.  As  Table  2  shows,  these  approaches  achieve  substantial 
improvements  over  even  a  rather  strong  version  of  primal  simplex  using  steepest-edge  pric¬ 
ing.  However,  the  growth  of  solution  times  in  Table  2  with  problem  size  also  demonstrates 
quite  clearly  that  direct  application  of  even  these  latter  approaches  is  not  sufficient  to  deal 
with  the  full  12.75  million  variable  problem.  This  is  especially  true  since  solving  the  integer 
program  will  require  multiple  solutions  of  the  linear  programming  relaxation. 

Fortunately,  set-partitioning  models  have  an  important  property  that  allows  them  to  be 
solved  by  non-direct  methods.  Let  FV  C  {l,...,n}  be  a  subset  of  columns  such  that  the 
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linear  program 

A^Xy^  —  e,  (2) 

X  yy  ^  0, 

is  feasible.  Then  this  problem  is  an  instance  of  (1),  since  columns  in  Aw  have  the  same 
structure  as  all  columns  in  A.  It  is  this  property  along  with  the  fact  that  n  is  significantly 
larger  than  m  that  makes  “sifting”  natural  for  (1),  and  also  makes  sifting  natural  for  a 
number  of  other  problems  such  as  the  traveling  salesman  problem  and  path  formulations  of 
appropriate  multicommodity  flow  problems. 

3  Sifting 

As  noted  in  the  introduction,  the  sifting  procedure  applied  here  was  suggested  by  Forrest  [4], 
who  termed  it  sprint. 

Consider  the  general  linear  program 

minimize  cTx 

subject  to  Ax  =  6,  (3) 

x  >  0, 

with  A  €  Mmxn,  cer,6e  The  dual  of  (3)  is 

maximize  bTy 

subject  to  ATy  <  c.  (4) 

The  sifting  procedure  begins  by  taking  a  “working  set”  of  columns  W  C  {1, . . .  ,n}  such 

that 

minimize  c^xw 

subject  to  Awxw  —  b,  (5) 

x  w  hi  0, 

is  feasible.  (This  assumption  is  not  essential.)  Let  7r*  be  an  optimal  solution  to 

maximize  bTw 

subject  to  A^n  <  cw,  (6) 

the  dual  of  (5),  and  let  x*w  be  an  optimal  solution  of  (5).  Then  the  vector  xT  —  ((x^)r,0)  € 
3ft”  is  optimal  for  (3)  if 

c  -  A  V  >  0.  (7) 

The  sifting  procedure  suggested  by  these  observations  may  be  summarized  as  follows: 


minimize 
subject  to 
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Given  the  linear  program  (3)  and  a  set  W  such  that  (5) 

Solve  (5)  obtaining  x*  and  7r*. 
while  (c  —  AttA  jt  0)  do 

Choose  V  C  {1, . . .  ,  n}  \  W. 

Set  W  <-  W  U  V. 

(Optionally)  If  W  is  too  big, 
reduce  the  size  of  W. 

Solve  (5)  obtaining  x*  and  7 r*. 
end  while 

In  the  following  sections,  the  key  ideas  in  implementing  the  above  procedure  are  discussed: 

1.  The  efficient  computation  of  Attt*  given  7r*  is  discussed  in  Section  3.1. 

2.  Choosing  a  good  set  of  candidates  for  V  is  discussed  in  Section  3.2,  where  a  new  method 
for  normalizing  reduced  costs  is  introduced. 

3.  Controlling  the  size  of  W  via  control  of  the  size  of  V  and  purges  of  W  is  discussed  in 
Section  3.3. 

4.  Finally,  the  specific  algorithms  used  to  solve  (5)  are  treated.  That  discussion  is  broken 
into  three  parts,  the  simplex  method  (Section  4),  an  interior  point  method  (Section  5), 
and  a  hybrid  interior  point /simplex  approach  (Section  6).  These  three  sections  are 
preceded  by  a  discussion  (Section  3.4)  of  a  number  of  issues  applicable  to  each  of  the 
three  solution  methods. 

3.1  Vectorization  of  the  Pricing  Operation 

For  each  major  iteration  of  the  sifting  procedure,  the  values  AT k*  are  computed.  However, 
on  a  vector  supercomputer,  the  usual  direct  procedure  to  compute  A1 7r*  can  exhibit  poor 
performance  if  most  of  the  columns  in  A  are  short.  To  get  effective  vectorization,  the 
algorithm  described  by  Forrest  and  Tomlin  [5]  is  used. 

Let  tj  represent  the  number  of  nonzeros  in  column  j.  Since  £j  <  m,  a  simple  bucket 
sort  (using  0{n)  time)  can  be  used  to  sort  the  columns  according  to  increasing  column 
length.  Given  this  ordering,  let  ti?  represent  the  number  of  columns  with  l  nonzeros  and  let 
U  represent  the  first  column  (in  the  new  ordering)  with  £  nonzeros.  Then  it  follows  that 
columns  jt,jt  +  1, . . .  ,je  +  nt  —  1  all  have  the  same  column  length.  To  compute  dj  —  Aj k*, 
f°r  jt  <  j  <  jt  +  nt  —  1  and  a  fixed  value  of  £,  the  following  procedure  is  used,  where  ikj 
represents  the  row  number  of  the  kth  nonzero  in  column  j: 

for  k  =  1  to  /  do 

for  j  =  j(  to  jt  +  m  -  1  do 
dj  =  dj  +  t rikAikjj 


is  feasible: 

(major  iteration) 
(price) 
(augment  problem) 

(purge) 

(solve) 
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This  loop  is  repeated  for  each  value  of  i.  The  loop  vectorizes  well  if  the  nonzeros  of  each 
column  and  the  row  indices  ikj  are  stored  consecutively.  This  happens  when  the  values 
ri£  are  large  as  compared  to  the  maximum  value  of  ij.  This  procedure  was  found  to  be 
approximately  ten  times  faster  than  the  usual  direct  procedure  for  computing  dj  by 

b 

^3  =  7r ikj  Aik] 3 
k= 1 

The  inner  loop  of  the  new  procedure  is  making  a  calculation  with  a  large  vector  of  aver¬ 
age  length  re^/10,  as  opposed  to  the  small  vectors  of  average  length  10  used  in  the  direct 
procedure. 


(8) 


3.2  A  New  Pricing  Rule 

The  initial  experiments  with  solving  the  full  problem  used  the  following  procedure  that 
chooses  a  target  number  t  of  columns  j  with  small  reduced  cost  Cj  —  AJtt*: 

procedure  standard_price: 

Step  1.  Compute  c  —  c  —  Attt*  . 

Step  2.  Find  V  =  { j  :  Cj  <  0}. 

Step  3.  If  V  has  more  than  t  columns,  then  sort  (in  ascending  order)  the 
values  Cj  for  j  €  V  to  find  the  t  columns  to  keep  in  V  with  the  most 
negative  values  of  Cj. 

Step  1  was  carried  out  as  described  in  Section  3.1.  Step  2  used  the  CRAY  subroutine  WHENFLE 
and  the  sort  in  step  3  was  done  by  using  the  CRAY  ORDERS  routine.  The  set  V  determined 
by  standard-price  was  added  to  the  current  working  set  W. 

For  the  particular  set  partitioning  problems  considered,  this  pricing  procedure  produced 
adequate  performance.  However,  in  working  directly  with  small  versions  of  these  problems,  it 
was  observed  that  the  steepest-edge  pricing  rule  [7]  with  the  simplex  method  produced  much 
better  results  than  simply  using  reduced  costs.  However,  the  steepest-edge  rule  is  inherently 
based  on  the  simplex  method.  Our  desire  was  to  use  a  column  selection  strategy  that  is  as 
independent  as  possible  of  the  algorithm  used  to  solve  the  subproblem.  An  analysis  of  the 
problem  yielded  a  new  pricing  rule  that  performed  well  for  these  problems. 

Given  an  optimal  solution  n r*  to  the  dual  (6)  of  (5),  let 


A  —  min 

1  <72 


Ay 


>  0 


(9) 


Then,  since  c  >  0,  it  is  clear  that  A  >  1  if  and  only  if  Cj  =  Cj  -  Ajx*  >  0  for  all  j.  Further, 
if  AJtt*  <  0  for  all  j,  then  y  =  7r*  is  optimal  for  the  dual  problem  (4).  It  follows  from  the 
definition  of  A  that  At(\tt*)  <  c,  so  that  the  vector  A7r*  provides  a  feasible  solution  y  =  A7r* 
to 


maximize  eTy 

subject  to  ATy  <  c, 


(10) 
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Problem 

Name 

Major  Iterations 

standard_price 

lambda_price 

lmil 

12 

9 

2mil 

11 

9 

4mil 

10 

11 

8mil 

15 

12 

13mil 

17 

12 

Table  2:  Major  Iterations  for  2  Pricing  Rules 

the  dual  of  the  linear  programming  relaxation  of  (1).  In  fact,  the  value  AeT7r*  provides  a 
lower  bound  on  the  objective  value  for  the  full  primal  linear  programming  relaxation  of  (1). 
This  bound  may  be  useful  in  other  procedures  that  use  column  generation  on  classes  of  linear 
programs  with  nonnegative  costs. 

Intuitively,  the  constraints  in  (10)  yielding  the  smallest  values  of  Cj/Ajn*  are  most  likely 
to  be  active  at  the  optimal  solution.  This  idea  suggests  the  following  procedure  for  selecting 
a  target  t  number  of  columns: 

procedure  lambda_price: 

Step  1.  Compute  d  ~  ATr* . 

Step  2.  Compute  c  =  c  —  d. 

Step  3.  Find  V  =  {j  :  Cj  <  0}. 

Step  4.  If  V  has  t  or  less  columns,  then  exit. 

Step  5.  Compute  A  j  —  Cj/dj  for  columns  j  E  V. 

Step  6.  Sort  (in  ascending  order)  the  values  A  j  for  j  E  V . 

Step  7.  Set  V  to  the  t  columns  with  the  lowest  values  of  \j. 

Note  that  if  j  E  V,  then  Cj  —  Ajn*  <  0  which  implies  that  Xj  <  1  and  AjV*  ^  0  so 
the  above  steps  are  well  defined.  This  pricing  rule  should  prefer  columns  with  low  absolute 
cost  as  well  as  columns  with  more  nonzeros.  Experimentation  with  the  initial  pricing  rule 
indicated  that  these  columns  should  be  preferred  as  these  types  of  columns  are  predominant 
in  the  optimal  solution. 

As  compared  to  the  standard_price  procedure,  there  is  a  small  amount  of  extra  work 
represented  by  step  5  in  lambda_price.  This  step  vectorizes  well  and  only  added  a  small 
cost  to  the  pricing  procedure.  This  cost  was  small  compared  to  the  savings  generated  by 
the  reduction  in  the  number  of  outer  iterations.  The  new  pricing  procedure  generally  seems 
to  pick  better  columns  for  the  sifting  procedure.  Table  2  indicates  the  number  of  major 
iterations  of  the  sifting  procedure  using  the  two  different  pricing  procedures.  CPLEX  was 
used  to  solve  the  subproblems.  The  different  problems  are  defined  in  Section  3.4. 
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Degeneracy  and  Multiple  Dual  Optimality 

It  is  well  known  that  the  linear  programming  relaxations  of  set  partitioning  problems  exhibit 
massive  primal  degeneracy.  For  the  sifting  procedure,  the  most  important  implication  is  the 
presence  of  multiple  dual  optimal  solutions.  The  particular  choice  of  optimal  dual  solution 
influences  the  selection  of  new  columns. 

If  the  subproblem  (5)  is  solved  by  the  simplex  method,  the  optimal  basis  chosen  is 
essentially  random,  changing  the  qualitative  nature  of  an  optimal  dual  solution.  If  the 
subproblem  (5)  is  solved  by  an  interior  point  method,  the  solution  will  be  in  the  relative 
interior  of  the  face  of  optimal  dual  solutions.  The  particular  dual  solution  7r*  chosen  depends 
on  the  initial  interior  point  used  by  the  solution  algorithm. 

3.3  Problem  Size  Control 

One  of  the  most  difficult  decisions  in  implementing  the  sifting  procedure  is  control  of  the 
cardinality  of  W.  We  chose  to  specify  a  sequence  (U,t2, . . .)  of  target  values,  such  that  the 
cardinaltiy  of  V  is  restricted  to  be  smaller  than  tk  on  the  kth  iteration  of  the  sifting  procedure. 
In  addition,  it  is  important  to  occasionally  purge  columns  from  W  as  an  additional  control 
on  the  size  of  the  subproblems.  The  details  of  these  procedures  for  each  of  the  three  solution 
methods  are  discussed  in  the  appropriate  sections.  Unfortunately,  one  of  the  weaknesses  of 
the  algorithm  as  implemented  is  that  these  sequences  were  determined  by  experimentation 
for  the  specific  problem  being  studied.  However,  once  a  reasonable  sequence  was  determined, 
it  was  applied  unchanged  to  all  problem  instances.  Thus,  although  the  largest  problem  solved 
has  approximately  12.75  million  columns,  we  would  choose  exactly  the  same  sequence  for  a 
problem  of  larger  size. 

It  may  seem  more  natural  to  simply  build  the  subproblems  by  including  in  V  all  columns 
with  negative  reduced  cost.  That  approach  is  simply  impractical.  For  example,  10  percent 
of  the  columns  have  negative  reduced  costs  on  the  first  iteration. 

3.4  Engineering  the  Problem 

Our  experiments  were  run  on  one  processor  of  a  CRAY  Y-MP  computer  with  128  megawords 
of  memory  running  UNICOS  6.0.11  with  the  cft77  Fortran  compiler  version  5.0.0  and  the 
see  C  compiler  version  3.0.0.  At  most  28  megawords  were  required  to  solve  any  of  the 
problems.  In  attempting  to  solve  such  a  linear  program,  a  number  of  tasks  are  required  to 
prepare  the  problem  for  solution. 

The  data  was  provided  in  7  ASCII  (machine  independent  format)  files.  Each  line  of  each 
file  corresponds  to  one  column  in  the  constraint  matrix.  A  simple  preprocessor  was  used 
to  read  in  this  file  in  250,000  column  sections  and  write  out  51  binary  files  consisting  of 
a  slightly  modified  image  of  the  original  data.  Because  of  the  nature  of  the  ASCII  files, 
39  sections  had  exactly  250,000  columns,  10  sections  had  250,302  columns,  1  section  had 
249,998  columns  and  1  section  had  250,297  columns.  This  process  of  converting  the  data  to 
a  binary  format  is  essentially  a  required  input  step.  Moreover,  it  is  reasonable  to  assume 
that  the  data  could  be  provided  in  this  binary  format  initially.  In  fact,  the  time  to  read 
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Problem 

Name 

Original 

Columns 

Nonduplicate 

Columns 

Number  of 
Sections  s 

Optimal 

Objective 

lmil 

1,000,000 

673,310 

4 

54325.363 

2mil 

1,999,998 

1,341,904 

8 

51725.598 

4mil 

3,999,998 

2,682,175 

16 

50413.578 

8mil 

7,999,998 

5,374,005 

32 

48994.020 

13mil 

12,753,313 

8,549,879 

51 

48400.129 

Table  3:  Problem  Statistics 

in  the  7  ASCII  files  (approximately  1400  CRAY  Y-MP  CPU  seconds)  exceeds  the  solution 
time!  The  particular  format  of  these  binary  files  stored  each  section  of  data  as  3  vectors,  a 
cost  vector,  the  vector  t  of  nonzero  counts  per  column,  and  a  vector  of  the  indices  of  rows 
with  nonzeros.  These  three  vectors  are  written  to  each  file  as  separate  binary  vectors  in 
order  to  enhance  vectorization. 

For  the  problem  that  was  provided,  the  first  1000  columns  constitute  an  initial  feasible 
solution.  This  set  of  columns  was  used  as  the  initial  W  for  all  solution  methods.  The 
performance  of  the  sifting  method  on  this  model  is  very  insensitive  to  the  size  of  this  initial 
feasible  set. 

To  reduce  the  cost  of  computing  AT/k*  on  each  iteration,  duplicate  columns  were  removed 
from  each  of  the  51  sections.  Let  S  C  {1, . . .  ,n}  denote  the  columns  within  some  particular 
section.  Given  S ,  all  of  the  nonduplicate  columns  in  S  are  found  before  sorting  the  columns 
of  S  by  the  number  of  nonzeros  per  column,  as  described  in  Section  3.1.  Note  that  this  step 
does  not  remove  all  nonduplicates  from  A,  as  duplicate  columns  may  still  exist  across  section 
boundaries.  Because  of  this,  duplicate  columns  were  eliminated  among  W  on  each  major 
iteration  of  the  sifting  procedure  before  the  subproblem  (5)  was  optimized. 

For  the  12,753,313  column  problem  provided  by  American,  duplicates  were  removed  from 
each  of  the  51  sections  reducing  the  size  of  each  section  to  an  average  size  of  167,645  columns. 
This  process  consumed  212  CRAY  Y-MP  CPU  seconds,  plus  an  estimated  103  additional 
overhead  seconds  for  input  of  the  binary  data  files  and  output  of  the  binary  non-duplicate 
files,  as  described  below.  Table  3  presents  statistics  for  the  sequence  of  5  problems  generated 
from  the  initial  data  that  were  used  for  the  tests  in  this  study  and  the  names  we  assigned  for 
reference  to  each  problem.  The  value  s  is  the  number  of  sections  used  to  solve  the  problem. 
Note  that  the  sizes  of  the  problems  are  not  exact  multiples  of  1,000,000  because  of  the  way 
the  original  data  sets  were  provided.  In  particular,  one  of  the  original  files  contained  only 
1,999,998  columns,  rather  than  the  expected  2,000,000. 

The  size  of  each  section  was  chosen  large  enough  to  get  good  vectorization  performance 
during  the  pricing  operation  and  small  enough  so  as  not  to  consume  too  much  memory 
when  loaded  from  external  storage.  Each  section  of  nonduplicate  columns  is  stored  in  an 
external  file.  On  the  CRAY  Y-MP,  experiments  were  conducted  with  three  choices  for  the 
locations  of  the  external  files.  The  first  choice  was  a  UNICOS  5.0  file  system,  the  second  a 
UNICOS  6.0  file  system,  and  the  third  a  solid-state  storage  device  (SSD)  that  essentially  is 
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a  RAM-disk.  It  was  found  that  using  SSD  to  store  the  external  files  produced  a  negligible 
overhead  for  the  sifting  procedure,  and  hence  all  further  results  are  reported  assuming  SSD 
usage.  Each  section  of  nonduplicate  files  is  stored  as  a  binary  file  on  the  SSD.  The  values 
ri£,  ikj  (Section  3.1)  and  the  cost  vector  c  are  stored  as  binary  vectors  in  these  files. 

There  is  one  final  issue  to  discuss  in  connection  with  nonduplicates.  Since  no  dupli¬ 
cates  were  initially  found  across  section  boundaries,  duplicate  columns  were  removed  from 
W  before  the  problem  (5)  was  solved.  Hence,  the  effective  number  of  columns  added  is 
unpredictable  and  decreases  as  s  increases.  As  a  consequence,  the  following  instrumentation 
was  added.  On  the  first  iteration  of  the  sifting  procedure,  V/s  columns  are  chosen  from 
each  section.  Duplicate  columns  are  removed  from  V  and  the  number  of  new  nonduplicate 
columns  added,  p1,  is  computed.  The  value  r  =  p1  /t1  is  then  determined.  Successive  target 
sizes  tk  for  k  >  1  are  then  adjusted  to  new  values  tk  =  tk  jr  to  regulate  the  growth  in  W.  On 
each  further  iteration,  tk/s  columns  are  then  chosen  from  each  section  using  the  procedure 
described  in  Section  3.2. 


4  Using  CPLEX 

To  test  the  application  of  the  simplex  method  in  the  sifting  procedure,  the  CPLEX  op¬ 
timizer  [3],  version  2.0,  was  used.  Much  effort  has  been  put  into  CPLEX  to  get  good 
vectorization  performance  from  various  aspects  of  the  code.  Some  modifications  of  CPLEX 
were  made  for  this  application.  To  conserve  approximately  6  percent  of  CPU  time,  routines 
accessing  the  matrix  Aw  assumed  that  nonzeroes  in  the  matrix  had  the  value  1.  In  addition, 
because  the  bases  of  the  problem  have  extremely  dense  factorizations,  the  refactorization 
interval  was  set  at  175. 

Extensive  experimentation  with  CPLEX  on  various  versions  of  these  problems  reveal 
that  steepest-edge  pricing  is  more  appropriate  than  standard  reduced-cost  pricing  for  both 
the  primal  and  dual  simplex  methods.  At  the  beginning  of  the  sifting  procedure,  the  initial 
feasible  solution  consisting  of  the  first  1000  columns  is  highly  degenerate  and  no  starting 
basis  is  available.  However,  the  basis  consisting  of  the  m  artificial  variables  is  dual  feasible, 
making  the  dual  simplex  method  effective  for  solving  the  first  few  subproblems.  Since  a 
large  number  of  columns  are  added  on  these  initial  iterations,  using  an  advanced  starting 
basis  with  the  dual  simplex  method  is  not  appropriate  due  to  the  large  dual  infeasibility 
incurred  by  the  addition  of  the  new  columns.  After  the  initial  degenerate  feasible  point  is 
passed  by  the  sifting  procedure,  the  optimal  bases  found  by  solving  (5)  are  less  degenerate, 
and  a  primal  simplex  method  using  the  previously  optimal  basis  works  well.  Note  that  this 
procedure  is  unstable  on  smaller  instances  of  the  problem  since  the  initial  degenerate  point 
may  not  be  passed.  In  this  case,  (5)  is  reoptimized  from  an  artificial  start  each  time,  and 
the  pricing  information  varies  due  to  multiple  dual  optimality,  as  discussed  in  Section  3.2. 

The  use  of  an  artificial  variable  basis  to  start  the  dual  simplex  algorithm  provides  some 
insight  about  the  dual  variables  7 r  when  using  the  simplex  algorithm  to  solve  the  subproblems. 
For  these  problems,  it  was  found  that  the  optimal  bases  tended  to  have  a  large  number  of 
artificial  variables,  which  are  degenerate.  (For  the  largest  problem  (13mil),  96  artificial 
variables  are  basic  at  the  optimal  solution.)  Artificial  variables  also  affect  the  prices  for  the 
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Problem 

Name 

Size  of  W 

Major 

Its. 

CPLEX 

Its. 

CPU  Time  (secs^ 

Max. 

Final 

Misc. 

Price 

Optim. 

Total 

lmil 

24,637 

7,778 

9 

11,411 

1.3 

3.0 

176.0 

179.0 

2mil 

24,911 

7,579 

9 

12,617 

1.4 

5.8 

199.7 

205.5 

4mil 

25,275 

14,256 

11 

16,769 

2.3 

13.5 

293.0 

308.8 

8mil 

26,539 

6,826 

12 

20,297 

1.9 

31.3 

325.6 

358.8 

13mil 

26,637 

7,556 

12 

19,396 

2.2 

47.4 

333.9 

383.5 

Table  4:  Results  with  CPLEX 

sifting  procedure.  If  the  artificial  variable  on  row  i  is  basic,  it  is  easy  to  see  that  7r*  =  0 
and  hence  row  i  contributes  nothing  to  the  selection  of  new  columns.  Thus,  it  is  worth 
considering  the  assignment  of  a  penalty  M  as  the  cost  of  each  artificial  variable,  as  in  the 
OBI  runs  (Section  5).  In  this  case,  the  artificial  variable  may  still  remain  basic  with  7r ;  =  M. 
For  the  simplex  method,  no  difference  was  found  in  the  performance  of  the  sifting  procedure 
when  a  penalty  M  was  used  as  compared  to  no  penalty. 

With  CPLEX,  the  deletion  strategy  aimed  at  keeping  the  subproblems  relatively  small.  A 
large  number  of  columns  requires  more  simplex  iterations  as  well  as  more  work  per  iteration 
since  full  pricing  is  being  used.  The  size  of  a  problem  was  limited  to  33,000  columns.  If 
a  problem  exceeded  that  size,  then  all  columns  j  €  W  with  a  reduced  cost  Cj  >100  were 
deleted.  (For  example,  in  Table  5  on  the  third  major  iteration,  18,666  colums  were  added  to 
a  problem  of  size  18,977.  Then  17,843  columns  were  removed  because  the  problem  was  too 
large.  This  was  followed  by  the  removal  of  8,884  duplicate  columns.)  Furthermore,  as  soon 
as  an  optimal  solution  to  (5)  was  found  with  a  smaller  objective  value  than  the  initial  feasible 
solution,  columns  were  purged  (using  the  same  rule).  Since  the  sifting  algorithm  used  the 
primal  simplex  algorithm  with  steepest-edge  pricing  after  this  point,  the  philosophy  is  to 
keep  the  subproblems  small. 

4.1  CPLEX:  Computational  Results 

The  sequence  (i1,  f2, . . . ,  t10)  was  taken  to  be  (16000, 16000, 8000, 8000, 4000, 2000, 2000, 1000, 
1000,500),  with  tk  =  1000  for  k  >  10.  After  removal  of  duplicates,  the  first  linear  program 
solved  in  each  case  had  851  columns.  Table  4  contains  the  summary  of  the  results  with 
CPLEX.  The  columns  indicating  the  size  of  W  represent  the  maximum  and  final  sizes  of 
the  linear  programs  (5)  that  were  solved.  The  number  of  major  iterations  is  the  number  of 
iterations  in  the  sifting  procedure.  The  number  of  CPLEX  iterations  is  the  total  number  of 
simplex  iterations  needed  to  solve  each  subproblem.  CPU  Time  represents  the  CRAY  Y-MP 
time  to  solve  the  problem,  broken  into  miscellaneous  time  (for  example,  the  time  to  remove 
duplicates  from  each  subproblem),  pricing  time  and  optimization  time.  The  pricing  time 
includes  the  time  required  to  add  the  columns  in  V  to  the  CPLEX  data  structures. 

Table  5  contains  an  iteration  log  for  solving  the  problem  13mil.  The  second  column 
indicates  the  size  of  W  on  each  iteration.  The  CPLEX  algorithm  is  the  dual  simplex  or 
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Major 

Iter 

Cols 
in  Prob 

CPLEX 

Alg. 

CPLEX 

iters 

Objval 

Optim. 

time 

Price 

time 

Cols 

added 

Cols 

purg. 

Dups 

del. 

1 

851 

■Sii 

108 

57448.000 

0.190 

5.960 

15963 

0 

9122 

2 

7692 

Si 

346 

57448.000 

1.500 

4.581 

37332 

7080 

18967 

3 

18977 

HH 

1654 

57448.000 

21.497 

4.625 

18666 

17843 

8884 

4 

10916 

Dual 

1544 

57448.000 

16.993 

4.795 

18666 

0 

8595 

5 

20987 

Dual 

3981 

57448.000 

69.532 

3.888 

9333 

0 

3683 

6 

26637 

Dual 

6218 

52035.965 

129.514 

3.584 

4641 

24725 

2267 

7 

4286 

Primal 

1829 

49835.305 

26.073 

3.529 

4641 

0 

2272 

8 

6655 

Primal 

1876 

48856.961 

35.269 

3.500 

2049 

0 

1309 

9 

7395 

Primal 

1356 

48438.438 

23.684 

3.466 

300 

0 

158 

10 

7537 

Primal 

432 

48400.691 

7.333 

3.288 

34 

0 

17 

11 

7554 

Primal 

50 

48400.129 

1.455 

3.100 

2 

0 

0 

12 

7556 

Primal 

2 

48400.129 

0.839 

3.076 

0 

0 

0 

Table  5:  Iteration  log  for  CPLEX  on  13mil 


primal  simplex  method,  each  with  steepest-edge  pricing.  The  number  of  CPLEX  iterations 
is  the  total  number  of  iterations  to  solve  the  corresponding  subproblems.  The  objective  value 
is  the  final  objective  value  of  the  subproblem  (5).  The  optimization  time  is  the  time  required 
by  CPLEX  to  solve  each  subproblem,  while  the  pricing  time  is  the  time  required  to  price  all 
12.75  million  columns  and  add  columns  to  the  CPLEX  data  structure. 

5  Using  OBI 

The  implementation  of  OBI  [9]  of  the  primal-dual  predictor- corrector  interior  point  method 
used  for  these  experiments  was  modified  to  take  advantage  of  the  special  characteristics  of 
the  problem.  In  each  iteration  of  the  interior  point  method,  it  is  necessary  to  compute  nTAw 
twice.  To  do  this  efficiently,  each  subproblem  was  preprocessed  to  order  the  columns  by 
column  count,  using  the  same  method  as  described  in  section  3.1.  In  addition,  all  references 
in  the  code  to  specific  nonzero  values  in  Aw  were  eliminated,  since  the  problem  is  assumed 
to  have  all  nonzero  values  in  Aw  equal  to  1. 

All  interior  point  methods  for  linear  programming  form  a  matrix  of  the  form  (A0AT) 
on  each  iteration,  where  0  is  some  diagonal  matrix.  This  step  is  followed  by  a  Cholesky 
factorization.  For  better  vectorization  performance,  it  was  found  that  reordering  Aw  to 
increase  the  sparsity  of  the  Cholesky  factor  L  was  not  worth  the  effort  as  {AwOA^}  is 
almost  totally  dense.  Hence,  reordering  was  not  used,  and  a  special  version  of  a  Cholesky 
factorization  that  assumed  a  dense  matrix  replaced  the  default  sparse  factorization  routines. 
This  factorization  routine  achieved  an  average  performance  on  the  CRAY  Y-MP  of  240 
Megaflops  (Millions  of  Floating  Point  Operations  per  Second)  on  a  single  processor  capable 
of  a  theoretical  maximum  of  333  Megaflops. 
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An  analysis  of  the  performance  of  OBI  solving  the  subproblems  revealed  that  the  forma¬ 
tion  of  (AwOA^)  was  requiring  a  significant  amount  of  the  processor  time.  An  analysis  of 
this  computation  led  to  an  algorithm  that  vectorizes  well  for  this  task.  For  this  discussion, 
let  h  represent  the  number  of  variables  in  the  linear  program  (5)  and  let  A  =  Aw.  Then 

ioiT=  £0,1,4  tin 

j= 1 

Note  that  the  nonzero  components  of  (AjAj)  are  exactly  equal  to  one.  The  matrix  (AoAT) 
is  symmetric  so  that  only  the  diagonal  and  the  lower  triangular  part  needs  to  be  computed. 
Suppose  Aj  has  lj  nonzeros  at  positions  ixj,  i2j, . . . ,  i^j.  The  diagonal  D  G  of  (AoAT) 
is  computed  as  follows: 

procedure  compute_diagonal: 

D  <-  0. 

for  j  —  1  to  n  do 

for  k  =  1  to  i j  do 

Dikj  <—  Dikj  +  Qj 

The  inner  loop  of  this  procedure  vectorizes  (although  with  short  vector  lengths,  adversely 
affecting  peak  performance)  since  the  values  iXj,  i2j , . . . ,  iej2  are  unique. 

To  efficiently  compute  the  lower  triangular  part  of  (A0AT),  an  address  list  similar  to  the 
one  described  by  Adler,  et.al.  [1]  is  used.  However,  the  nature  of  the  matrix  dictates  that 
the  “elementary  products”  are  easier  to  store  since  each  is  equal  to  1.0.  To  achieve  the  best 
possible  vectorization,  the  address  list  is  computed  in  a  special  way.  Let  Q  e  3£mxm  be  a 
two-dimensional  lower  triangular  matrix  with  a  zero  diagonal.  For  ix  >  i2,  define  Qili2  to 
be  the  number  of  columns  in  A  containing  both  i\  and  i2.  It  is  easy  to  see  that  the  lower 
triangle  of  Q  is  equal  to  the  lower  triangle  in  the  matrix  (AAT).  Let 

Q  =  max  {Qip2  :  1  <  i2  <  ix  <  m}  ,  (12) 

so  that  Q  is  the  maximum  element  in  Q.  Define  Tq,  for  1  <  q  <  Q,  by 

Tq=  £  I{Qili2>q),  (13) 

where  /  is  an  indicator  function  defined  by 

1(0  >  n)  =  l  1  ^'i*2  -  ^ 

_  Q)  |  0  otherwise 

Hence  Tq  is  a  count  of  the  number  of  elements  in  Q  greater  than  or  equal  to  q. 

The  address  list  R  consists  of  Q  sections,  each  with  Tq  ordered  triples,  1  <  q  <  Q.  The 
ordered  triples  ( iT,kT,jr )  in  section  Rq,  1  <  r  <  Tq,  are  defined  by  an  inductive  rule  such 
that  (fr,  kr,jr )  G  Rq  if  and  only  if  the  following  three  conditions  are  satisfied: 


(14) 


1-  Qirkr  q'l 
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2.  Air ir  =  1  and  Akrjr  =  1;  and, 

3.  ( ir,kr,jr )  i?4,  for  t  <  q. 

Hence,  R\  is  a  list  of  all  of  the  nonzero  positions  in  Q ,  with  each  element  of  Q  having  one  of 
the  columns  jr  that  contribute  to  it.  The  list  R2  is  a  list  of  all  of  the  nonzero  positions  that 
have  2  or  more  columns  contributing  to  Q,  using  a  different  column  from  the  one  present  in 
Ri. 

Given  the  address  lists,  it  is  easy  to  then  compute  the  lower  triangular  part  of  (AOAT) 
using  the  following  procedure: 

procedure  compute_AAT: 

(A0AT)jlt-2  <—  0  for  1  <  *2  <  ii  <  m. 

for  q  =  1  to  Q  do 

for  r  =  1  to  Tq  do 

{A@AT)irkr  _  (AeAT)lrtr  +  e,. 

The  inner  loop  of  this  procedure  vectorizes  well  and  substantially  reduces  the  time  it  takes  to 
compute  (AQAT).  Note  that  this  matrix  is  actually  stored  as  a  vector  and  that  the  location 
of  element  (fr,  kT )  is  easily  computed  since  (AQAT)  is  assumed  to  be  dense  and  is  stored  as 
a  long  vector.  In  fact,  the  address  list  stores  this  location.  If  W  has  n  columns,  then  the 
length  of  the  address  list  (the  size  of  R)  is  expected  to  be  Pn/2,  where  t  is  the  average  of 

^1 1  ^2 )  •  •  •  j  ^ n  • 

When  using  the  interior  point  method  within  sifting,  it  was  found  that  deleting  columns 
was  not  beneficial.  Deleting  many  columns  caused  most  of  those  columns  to  re-enter  the 
problem  on  a  later  iteration.  Hence,  growth  in  the  size  of  W  was  allowed  and  controlled. 
It  seems  that  the  dual  solution  n*  generated  on  a  particular  iteration  k  yielded  important 
information  regarding  the  dual  feasibility  of  the  problem.  Discarding  columns  that  forced 
this  dual  feasibility  to  be  maintained  was  not  useful  due  to  the  fact  that  the  dual  solution  is 
in  the  relative  interior  of  a  dual  face  of  the  subproblem.  This  contrasts  to  using  the  simplex 
method,  where  the  extreme  point  nature  of  the  dual  solution  seemed  to  allow  column  deletion 
without  any  difficulty. 

The  predictor-corrector  interior  point  method  was  started  with  a  primal  solution  of  Xj  — 
100.0  for  each  j  E  W,  with  a  dual  feasible  solution  of  7r  =  0  and  z  =  cw.  The  large  value  of  Xj 
improves  the  performance  of  the  predictor-corrector  method.  No  attempts  at  implementing 
a  restart  strategy  along  the  lines  of  Lustig,  et.al.  [10]  or  Mitchell  [13]  were  successful.  This 
is  viewed  as  an  avenue  for  future  research. 

Since  these  problems  are  highly  rank  deficient,  a  set  of  m  artificial  columns  with  a  cost 
of  10,000  were  included  in  the  problems  solved  by  OBI  on  each  iteration  of  the  sifting  pro¬ 
cedure.  While  the  presence  of  these  columns  assists  in  keeping  {AOAT)  better  conditioned, 
these  columns  also  influence  the  pricing  information  obtained  by  OBI.  As  remarked  earlier, 
the  presence  of  these  columns  did  not  affect  the  performance  of  the  sifting  procedure  with 
CPLEX. 
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Problem 

Name 

Final  Size 
of  W 

Major 

Its. 

OBI 

Its. 

CPU  Time  (secs^ 

Misc. 

Price 

Optim. 

Total 

lmil 

28,211 

8 

197 

1.5 

3.6 

230.2 

235.3 

2mil 

32,541 

8 

210 

1.7 

7.0 

257.8 

266.5 

4mil 

36,415 

8 

204 

1.9 

14.3 

251.6 

267.9 

8mil 

41,680 

9 

242 

2.4 

29.5 

306.6 

338.5 

13mil 

52,368 

9 

239 

3.0 

48.8 

326.7 

378.5 

Table  6:  Results  with  OBI 


Major 

Iter 

Cols 
in  Prob 

OBI 

iters 

Objval 

Optim. 

time 

Price 

time 

Cols 

added 

Dups 

del. 

1 

1688 

9 

57448.000 

8.528 

7.191 

16983 

10934 

2 

7737 

13 

57448.000 

13.317 

6.104 

16269 

7453 

3 

16553 

18 

57448.000 

19.467 

6.378 

30855 

13898 

4 

33510 

32 

52316.105 

41.113 

5.745 

30855 

12791 

5 

51574 

34 

48815.031 

48.720 

5.662 

1747 

1044 

6 

52277 

33 

48431.762 

48.084 

5.620 

244 

169 

7 

52352 

33 

48400.703 

50.498 

4.160 

26 

15 

8 

52363 

32 

48400.129 

46.224 

4.752 

37 

32 

9 

52368 

35 

48400.129 

50.736 

3.196 

0 

0 

Table  7:  Iteration  Log  for  OBI  solving  13mil 

5.1  OBI:  Computational  Results 

In  this  section,  computational  results  are  presented  for  the  sifting  procedure  using  OBI  to 
solve  the  subproblems.  The  values  used  were  t 1  =  17,000,  t 2  =  5800  and  tk  =  11000  for 
k  >  3.  Table  6  presents  the  results  for  the  sifting  procedure  using  the  interior  point  method 
to  solve  the  subproblems.  The  final  size  of  W  is  also  the  maximum  size  since  no  deletions 
were  done  on  any  of  the  problems.  The  number  of  major  iterations  remained  essentially 
constant.  The  times  represent  similar  values  as  in  Section  4.1. 

Table  7  illustrates  the  iteration  log  for  solving  the  problem  13mil  using  OBI  as  the 
solver.  It  is  interesting  to  note  that  the  sifting  procedure  with  OBI  requires  fewer  iterations 
to  get  past  the  objective  value  corresponding  to  the  initial  point.  The  prices  obtained  by 
OBI  are  insensitive  to  the  degeneracy  of  the  initial  problem,  although  they  are  sensitive  to 
the  initial  interior  point  used  by  the  primal-dual  algorithm.  The  time  to  solve  each  problem 
rises  with  the  number  of  columns  and  is  fairly  predictable.  The  pricing  time  per  iteration 
for  this  method  is  higher  due  to  a  higher  overhead  required  to  add  columns  to  the  OBI  data 
structures  than  the  CPLEX  data  structures. 
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6  A  Hybrid  Method 

A  careful  examination  of  the  above  computational  results  reveals  that  the  interior  point 
method  spends  a  predictable  amount  of  time  solving  each  subproblem.  It  performs  quite 
well  in  the  beginning,  but  towards  the  end,  the  same  amount  of  time  is  spent  to  solve  a 
new  subproblem  when  only  a  few  columns  are  added.  In  contrast,  the  simplex  method  has 
difficulty  making  progress  away  from  the  initial  highly  degenerate  solution,  but  as  fewer 
columns  are  added  after  this  point  is  passed,  the  simplex  method  spends  little  time  re¬ 
optimizing  the  new  problem  starting  from  an  optimal  basis  for  the  previous  subproblem. 
This  led  to  the  development  of  a  hybrid  procedure,  where  the  interior  point  method  solves 
5  linear  programs  and  is  then  followed  by  the  application  of  the  simplex  method.  In  effect, 
OBI  is  being  used  to  find  a  good  starting  point  for  the  simplex  method. 

A  difficulty  with  this  method  is  the  identification  of  a  starting  basis  for  the  simplex 
method  from  an  interior  point  solution.  If  all  columns  with  Xj  >  e  (where  e  is  a  suitably 
chosen  tolerance)  are  selected  to  be  basic  columns,  and  the  dual  solution  is  ignored,  too  much 
information  is  lost.  Essentially,  the  solution  provided  by  the  interior  point  method  indicates 
something  about  the  dual  feasibility  of  all  the  columns  in  W,  and  discarding  columns  for 
which  dual  feasibility  is  attained  does  not  work  well. 

Therefore,  the  procedure  described  by  Megiddo  [12]  was  used  to  identify  an  optimal 
basis.  The  implementation  of  this  procedure  is  described  by  Lustig  [8].  First,  the  procedure 
identifies  a  crash  basis,  leaving  all  nonbasic  variables  at  the  value  determined  by  the  interior 
point  method.  This  solution  is  purified  to  a  primal  optimal  solution  by  lowering  each  positive 
nonbasic  Xj  to  its  bound.  The  work  per  iteration  of  this  method  is  equivalent  to  the  work 
per  iteration  of  the  primal  simplex  method.  The  second  part  of  the  procedure  maintains  the 
dual  feasibility  of  the  interior  point  method  by  doing  a  similar  procedure  to  the  dual  linear 
program.  Each  of  these  iterations  is  equivalent  to  a  step  of  the  dual  simplex  method.  The 
procedure  is  guaranteed  to  converge  in  at  most  m  pivots.  For  the  results  reported  here,  the 
routines  implemented  by  Lustig  [8],  using  XMP  [11]  and  LA05  [15],  were  employed.  Both 
of  these  codes  are  not  suitable  for  good  vectorization.  Better  results  would  be  obtained  by 
implementing  this  crossover  algorithm  within  CPLEX. 

Purifying  the  primal  and  dual  optimal  solutions  in  this  way  maintained  dual  feasibility 
relative  to  the  columns  already  in  W.  All  columns  with  Cj  <  100,  j  G  W,  were  then  passed 
to  the  simplex  method,  effectively  paring  the  problem  down  to  an  acceptable  size.  This 
procedure  worked  well  and  substantially  reduced  the  total  computation  time. 

6.1  OBI  and  CPLEX:  Computational  Results 

In  this  section,  computational  results  are  presented  for  the  hybrid  scheme  described  above. 
For  each  problem  solved,  OBI  was  used  to  solve  the  first  5  subproblems,  with  t 1  =  25,500 
and  tk  =  11,000  for  2  <  k  <  4.  (Note  that  these  target  values  differ  from  those  used  in 
Section  5.1,  where  total  problem  growth  is  a  more  serious  consideration.)  After  the  fifth  linear 
program  was  solved,  the  crossover  scheme  was  invoked  followed  by  one  solve  of  a  small  linear 
program  via  CPLEX  to  get  an  initial  factorization.  After  this  initial  solve,  t6  ~  t7  —  800 
and  tk  =  400  for  k  >  8.  Note  that  t 5  is  considered  irrelevant  due  to  the  crossover  scheme. 
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Problem 

Name 

Size  of  W 

Major 

Its. 

OBI 

Its. 

Crossover 

Its. 

CPLEX 

Its. 

Last 

OBI 

First 

CPLEX 

Last 

CPLEX 

lmil 

33,456 

2457 

3899 

9 

119 

335 

228 

2mil 

37,185 

2984 

4250 

9 

116 

421 

358 

4mil 

41,479 

3539 

4592 

9 

116 

387 

478 

8mil 

42,176 

4228 

6774 

12 

126 

336 

1389 

13mil 

45,083 

5217 

6980 

10 

111 

327 

1268 

Table  8:  Statistics  for  Hybrid  Scheme 


Problem 

Name 

CPU  Time  (secs) 

Misc. 

Price 

OBI 

Cross 

CPLEX 

Total 

lmil 

1.3 

3.1 

145.4 

21.1 

8.7 

179.6 

2mil 

1.4 

6.0 

145.4 

22.9 

10.7 

186.4 

4mil 

1.6 

12.3 

146.8 

24.7 

13.8 

199.2 

8mil 

1.9 

31.9 

163.1 

25.9 

35.9 

258.7 

13mil 

2.0 

42.6 

140.3 

30.5 

33.6 

249.0 

Table  9:  CPU  Times  for  Hybrid  Scheme 

Table  8  shows  various  statistics  for  this  approach,  while  Table  9  shows  CPU  time.  In  Table 
8,  the  size  of  the  last  linear  program  solved  by  OBI  is  given,  along  with  sizes  for  the  first 
and  last  linear  programs  solved  by  CPLEX.  The  total  number  of  major  iterations  includes  5 
interior  point  iterations,  followed  by  1  crossover  iteration,  followed  finally  by  a  sequence  of 
CPLEX  solves.  In  Table  9,  the  miscellaneous  time  includes  miscellaneous  tasks  such  as  the 
deletion  of  duplicates  before  each  subproblem  is  solved.  The  pricing  time  is  the  time  to  price 
out  all  of  the  columns  including  the  time  required  to  add  the  columns  to  the  appropriate 
data  structures  for  OBI  or  CPLEX.  The  crossover  time  is  the  time  used  by  the  0B1/XMP 
routines  to  compute  an  optimal  basis.  The  final  CPLEX  time  is  the  time  used  by  CPLEX 
to  finish  the  procedure. 

Table  10  gives  an  iteration  log  for  solving  the  problem  13mil  with  the  hybrid  scheme. 
After  OBI  solves  the  first  5  subproblems,  the  crossover  procedure  is  invoked.  The  crossover 
requires  only  327  iterations,  but  as  previously  noted,  these  iterations  are  expensive  because 
of  the  lack  of  vectorization  in  XMP  and  LA05.  CPLEX  is  then  handed  a  small  problem  and 
an  optimal  basis.  This  basis  is  factorized  and  checked  for  optimality.  CPLEX  then  solves 
the  remainder  of  the  subproblems. 

The  computational  performance  of  the  hybrid  scheme  is  clearly  superior  to  the  pure  inte¬ 
rior  point  or  pure  simplex  procedures  and  could  be  further  improved  by  doing  the  crossover 
procedure  within  CPLEX.  Moreover,  the  hybrid  procedure  is  clean  and  a  more  robust  prod- 
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Major 

Iter 

Cols 
in  Prob 

Method 

iters 

Objval 

Optim. 

time 

Price 

time 

Cols 

added 

Dups 

del. 

1 

1688 

mnm 

9 

57448.000 

8.549 

7.262 

25449 

14638 

2 

12499 

■31 

14 

57448.000 

15.324 

6.041 

25857 

12135 

3 

26221 

24 

57448.000 

28.582 

6.493 

25857 

12774 

4 

39304 

33 

50734.910 

44.399 

13018 

7239 

5 

45083 

■H 

31 

48530.461 

43.398 

0 

0 

* 

45083 

Crossover 

327 

48530.461 

30.538 

6 

5217 

CPLEX 

0 

48530.461 

3.105 

3.555 

1832 

1000 

7 

6049 

CPLEX 

297 

48510.117 

9.572 

3.532 

1798 

1022 

8 

6825 

CPLEX 

648 

48418.199 

13.754 

3.571 

382 

238 

9 

6969 

CPLEX 

316 

48400.129 

6.244 

3.259 

18 

7 

10 

6980 

CPLEX 

7 

48400.129 

0.972 

3.145 

0 

0 

Table  10:  Iteration  log  for  problem  13mil  with  hybrid  scheme 

uct.  It  is  not  subject  to  the  unpredictable  behavior  of  the  simplex  method  in  the  earlier 
“degenerate”  phase,  and  is  not  sensitive  to  the  total  number  of  iterations  at  the  end  in 
the  same  way  as  OBI,  where  an  additional  pass  can  cost  almost  1  minute  of  additional 
computation  time. 
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